<!DOCTYPE html PUBLIC "-//w3c//dtd html 4.0 transitional//en">
<html>
<head>
  <meta http-equiv="Content-Type"
 content="text/html; charset=iso-8859-1">
  <meta name="Author" content="Ralph Grishman">
  <meta name="GENERATOR"
 content="Mozilla/4.7 [en]C-CCK-MCD NSCPCD47  (Win95; I) [Netscape]">
  <title>Noun Group Chunker</title>
</head>
<body text="#000000" bgcolor="#fff0f0" link="#ff0000" vlink="#800080"
 alink="#0000ff">
<h2>
<font face="Arial Alternative"><font color="#3333ff">Noun Group Chunker<br>
</font></font></h2>
<br>
<table style="text-align: left; width: 500px;" border="1"
 cellspacing="2" cellpadding="2">
  <tbody>
    <tr>
      <td
 style="vertical-align: top; background-color: rgb(153, 255, 153); width: 200px;">action
name<br>
      </td>
      <td
 style="vertical-align: top; background-color: rgb(153, 255, 153); width: 300px;"><span
 style="font-family: monospace;">chunk</span><br>
      </td>
    </tr>
    <tr>
      <td
 style="vertical-align: top; background-color: rgb(153, 255, 153); width: 200px;">resources
required<br>
      </td>
      <td
 style="vertical-align: top; background-color: rgb(153, 255, 153); width: 300px;"><span
 style="font-style: italic;">chunk model (max ent probabilities)</span><br>
      </td>
    </tr>
    <tr>
      <td
 style="vertical-align: top; background-color: rgb(153, 255, 153); width: 200px;">properties<br>
      </td>
      <td
 style="vertical-align: top; background-color: rgb(153, 255, 153); width: 300px;"><span
 style="font-family: monospace;">Chunker.fileName</span><br>
      </td>
    </tr>
    <tr>
      <td
 style="vertical-align: top; background-color: rgb(153, 255, 153); width: 200px;">annotations
required<br>
      </td>
      <td
 style="vertical-align: top; background-color: rgb(153, 255, 153); width: 300px;"><span
 style="font-family: monospace;">token<br>
tagger<br>
ENAMEX</span><br>
      </td>
    </tr>
    <tr>
      <td
 style="vertical-align: top; background-color: rgb(153, 255, 153); width: 200px;">annotations
added<br>
      </td>
      <td
 style="vertical-align: top; background-color: rgb(153, 255, 153); width: 300px;"><span
 style="font-family: monospace;">ng</span><br>
      </td>
    </tr>
  </tbody>
</table>
<br>
The statistical noun group chunker is an English chunker trained on the
Penn Treebank.&nbsp; It identifies noun groups (each consisting of a head noun
and its left modifiers) and assigns an <span
 style="font-family: monospace; font-weight: bold;">ng</span>
annotation to each noun group.&nbsp; The chunker uses a maximum entropy
model whose features are specific lexical tokens and their (Penn)
part-of-speech tags, obtained from <span
 style="font-family: monospace; font-weight: bold;">tagger</span> annotations.&nbsp;
If name annotations (<span
 style="font-family: monospace; font-weight: bold;">ENAMEX</span>) are
present, the chunker ensures that noun group boundaries do not
fall in the middle of names.<br>
<br>
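The name constraint described above can be illustrated with a minimal
sketch.&nbsp; It is not the chunker's actual code: the B/I/O tag scheme, the
function names <span style="font-family: monospace;">enforce_name_constraint</span>
and <span style="font-family: monospace;">noun_group_spans</span>, and the
repair policy (every token after the first in a name is forced to continue
the current noun group) are all assumptions made for illustration.<br>
<br>

```python
# Illustrative sketch only; the tag scheme and repair policy are assumptions,
# not the behavior of the actual chunker.

def enforce_name_constraint(tags, name_spans):
    """Return a copy of tags in which no noun group boundary falls inside a name.

    tags: one tag per token, "B" (begins a noun group), "I" (continues one),
          or "O" (outside any noun group).
    name_spans: (start, end) token index pairs for names, end exclusive.
    """
    fixed = list(tags)
    for start, end in name_spans:
        # The first token of a name must belong to a noun group.
        if fixed[start] == "O":
            fixed[start] = "B"
        # Every later token of the name continues that same group.
        for i in range(start + 1, end):
            fixed[i] = "I"
    return fixed


def noun_group_spans(tags):
    """Convert B/I/O tags into (start, end) spans, end exclusive."""
    spans = []
    start = None
    for i, tag in enumerate(tags):
        if tag == "B" or (tag == "I" and start is None):
            # A new group begins; close the previous one if it is still open.
            if start is not None:
                spans.append((start, i))
            start = i
        elif tag == "O" and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(tags)))
    return spans


# Tokens: the New York Stock Exchange rose
# The name covers tokens 1 to 4; the raw tags wrongly split it at "Stock".
raw_tags = ["B", "I", "I", "B", "I", "O"]
fixed_tags = enforce_name_constraint(raw_tags, [(1, 5)])
groups = noun_group_spans(fixed_tags)
```

In this example the raw tags place a boundary before "Stock", inside the
name; after the repair the whole phrase forms a single noun group spanning
tokens 0 through 4.<br>
<br>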
</body>
</html>
